ABSTRACT

We propose a method for the characterization of 
the local intrinsic curvature of adsorbed DNA 
molecules. It relies on a novel statistical chain de- 
scriptor, namely the ensemble averaged product of 
curvatures for two nanosized segments, symmetric- 
ally placed on the contour of atomic force micros- 
copy imaged chains. We demonstrate by theoretical 
arguments and experimental investigation of repre- 
sentative samples that the fine mapping of the 
average product along the molecular backbone gen- 
erates a characteristic pattern of variation that 
effectively highlights all pairs of DNA tracts with 
large intrinsic curvature. The centrosymmetric char-
acter of the chain descriptor enables targeting
strands with unknown orientation. This overcomes 
a remarkable limitation of the current experimental 
strategies that estimate curvature maps solely 
from the trajectories of end-labeled molecules or 
palindromes. As a consequence our approach 
paves the way for a reliable, unbiased, label-free 
comparative analysis of bent duplexes, aimed at
detecting local conformational changes of physical or
biological relevance in large sample numbers. 
Notably, such an assay is virtually inaccessible 
to the automated intrinsic curvature computation 
algorithms proposed so far. We foresee several 
challenging applications, including the validation of 
DNA adsorption and bending models by experi- 
ments and the discrimination of specimens for 
genetic screening purposes. 



INTRODUCTION 

Atomic force microscopy (AFM) is nowadays routinely 
used to resolve the contour of adsorbed DNA molecules 
with nanoscale resolution, thereby supporting important
advances in fundamental and applied studies in
biophysics, molecular biology, genetics, genomics and
nanobiomedicine. Notably, through the analysis of
contours one can map the DNA intrinsic curvature and 
flexibility along the molecular backbone (1-6). This tech- 
nique is particularly suited to address at the experimental 
level the impact of base-pair sequence on the local con- 
formation of the strands (1,3-8) and also plays a pivotal 
role for investigations attempting to relate the inherent 
DNA shape and flexibility to other physical and biological 
properties, such as melting (9), ligand interactions (10-12), 
replication (13), genomic packaging and transcription 
regulation (14). 

The wide applicability of AFM-based curvature studies 
demands simple and reliable experimental methods, 
characterized by a few processing steps for specimen prep- 
aration and minimum experimental bias on intrinsic 
curvature measurements. These requirements are even 
more important in view of the introduction of effective 
assays for DNA analysis fully based on AFM imaging 
[e.g. sizing (15), genotyping and haplotyping (16), expression
profiling (17)]. Such a step may lead one to envision a key
role for the nanoscale curvature analysis within more 
complex protocols, setting up population-based genetic 
disease studies or solving genomic screening problems at 
the single-molecule level. 

Current methods for the estimation of the local intrinsic 
curvature and flexibility start from the evidence that DNA 
is a long molecule whose conformation is constantly 
fluctuating under thermal perturbations while adsorbing 
from a bulk solution onto a solid support. This suggests
adopting a statistical approach, based on high-resolution
imaging and computer-assisted tracing of adsorbed mol- 
ecules, in order to sample with accuracy the ensemble of 
accessible DNA conformations. These steps are followed 
by the estimation of the signed curvature on individual 
contours and by an ensemble averaging process, con- 
ducted to separate the intrinsic (static) curvature contri- 
bution from the thermally activated (dynamic) one (3). 
A correct conformational average requires a proper
alignment of each contour of the ensemble, i.e. it is
necessary to identify (i) which of the two contour ends
corresponds to the starting point of the base-pair
sequence and (ii) which of the two contour orientations
with mirror curvature profiles reflects the actual helical
region exposed by the adsorbed chain to the substrate.
Detection of chain polarity was traditionally solved by 
end-labeling with bulky tags, e.g. streptavidin, 
streptavidin-ferritin, infrared dyes or colloidal gold 
(2,4,16,18). These procedures can, however, perturb the 
overall conformation of adsorbed molecules and represent 
indeed a time-consuming, labor-intensive part of the 
whole experiment. A different solution consists in the
preparation of palindromic constructs starting from the
target molecules (3). This has the advantage of bypassing
the need for chain polarity discrimination due to the dyad
symmetry of the base-pair sequence. Despite such
progress, specimen preparation issues very likely hamper 
the broad applicability of similar studies: as a matter of 
fact, AFM-based curvature maps are so far limited in 
number (1-6,8,18-21) and mostly related to the ana- 
lysis of few model systems [e.g. pBR322 DNA in 
(2,3,18,20,21)]. 

Here we propose a novel method for the characteriza- 
tion of the local intrinsic curvature, which is inspired by 
the aforementioned works on computer-assisted tracing of 
adsorbed molecules (1-6), yet employs a new statistical 
chain descriptor: the ensemble averaged product of curva- 
tures for two nanosized segments symmetrically placed on 
the contour of AFM-imaged chains. This peculiar choice 
results in a centrosymmetric statistical quantity that
enables targeting adsorbed strands with unknown orientation.
Accordingly, conformational averages are calculated
without the need for a proper alignment of the AFM- 
imaged molecular trajectories. In particular, we dem- 
onstrate by theoretical arguments from polymer chain 
statistics that the fine mapping of the average curvatures 
product (CP) along the molecular backbone effectively 
highlights all pairs of DNA tracts with large intrinsic 
curvature. This conclusion is further supported by the
direct investigation of representative specimens from the 
promoter region of the human osteopontin (OPN) coding 
gene and the successful comparison of experimental 
findings with simulations based on well-known DNA 
bending models. We finally contrast the results of the 
novel method with those obtained by the automated in- 
trinsic curvature computation algorithms proposed so far 
(19). The superior response offered by our method, in 
terms of robustness, accuracy, flexibility and widespread 
applicability, justifies its potential use in novel, label-free 



comparative assays of bent duplexes, aimed to detect local 
conformational changes of physical or biological rele- 
vance in large sample numbers. The method is intended 
to involve $10^2$-$10^3$ bp long fragments that can be readily
prepared for AFM imaging. 



MATERIALS AND METHODS 

Sample preparation 

DNA samples were obtained by PCR amplification of the 
regulatory region of the OPN encoding gene, as described 
previously (22); amplicons were purified in 1% (w/w) 
agarose gel and electroeluted, then the solution was 
treated with phenol/chloroform followed by ethanol pre- 
cipitation. The pellet was stabilized in Tris-EDTA buffer 
and stored at -20°C. DNA concentration, determined by
absorbance at 260 nm, was in the range of 100 nM. 
Conventional haplotype analysis allowed us to focus pri- 
marily on a 1332 bp specimen with the nucleotide sequence 
reported in Figure 1. Importantly, this template does not 
contain extended strings of phased A-tracts or other 
prominent nucleotide sequences (e.g. periodic $A_n/T_n$
groups) that could introduce anomalously large bends 
in the adhered DNA molecules (1-3,8) and bias our 
proof-of-principle investigation. For comparative 
purposes, a second 1335 bp specimen with point mutations 
at four, well-known polymorphic loci was also considered 
(see Supplementary Data). With the exception of the 
last subsection (devoted to contrast both samples), all ex- 
perimental and theoretical results reported below refer to 
the 1332 bp sample. 

The DNA adsorption was carried out according to the 
standard protocols reported in literature (23). A 20 µl
aliquot of solution containing 4 mM Hepes (pH 7.4),
4-10 mM MgCl$_2$ and 2 nM DNA was deposited onto
freshly cleaved muscovite mica (Agar Scientific); the 
sample drop was incubated for about 120 s and rinsed 
with MilliQ water. The surface was finally dried with a 
gentle stream of nitrogen. 

Characterization of DNA local intrinsic curvature 

Common practice in AFM studies on DNA structure and
flexibility dictates that specimens be prepared by DNA
adsorption from an aqueous solution onto an atomically flat
substrate. This is followed by high-resolution imaging of
adsorbed species and by the use of image-analysis
software to reconstruct the molecular profiles
and analyze the signed curvature associated with segments
of given location and length (2-5,8,24). Tracing algorithms
represent each molecule as a chain of xy pairs
separated by a fixed contour length $l$. The curvature
analysis along a generic trajectory proceeds through the
calculation of the signed bending angles $\theta_i$ formed by
adjacent units, which are obtained from the vector product
of the local tangent vectors $t_i$ and $t_{i+1}$ ($i = 1, 2, \ldots, N-1$,
with $N$ the total number of units) (3). From the $\theta_i$ values
one can define the global curvature $C_{j,m}$ for a segment
GTGACTGCCT
TTTGCATCTA 
TGACCTAGGT 
GGTTGCACAG 
GCAGTCATCC 
TTTCATATAG 
AAATCACAAA 
ATAGCCTTCT 
CTTAAACCGA 
TTTCATGGGA 
GATATTGTAC 
ACAAAACCAG 
TCCGCCTCCC 
GCAGCAGGAG 
CCAAACGCCG 
TCCAAATTCT 
ATAAGCAGAT 
TTAAAACTTA 
AAAATTAAAT 



GCCCCTCTTA 
ATATGTGCTA 
AATAGTATTG 
GTCAGCAGTG 
TTCTCTCAGT 
AATTTTATAT 
GCTAAGCTTG 
GGCTCTTCAA 
AAGAAACAAA 
TCCCTAAGTG 
ATAAGTAATG 
AGGGGGAAGT 
TGTGTTGGTG 
GAGGCAGAGC 
ACCAAGGTAC 
AAGGAAAAAT 
TAGATACATT 
AAAATACTTC 
TCACCTTTGG 



AAAATTTCAT 
AGCATTGCTA 
CATTTCATGG 
ACACAGCGGA 
CAGAAACTGC 
TTTAATGTCA 
AGTAGTAAAG 
TAAGTACAAT 
AATCCATTGT 
CTCTTCCTGG 
TTTTAACTGT 
GTGGGAGCAG 
GAGGATGTCT 
ACAGCATCGT 
AGCTTCAGTT 
ATTTTTAATT 
GCAGGTCTCC 
CACTGGGTCC 
AATAATTATA 



AATAGTTAAC 
GTTTAACATA 
ATGAGGGAAC 
ATTCAGAACC 
TTTACTTCTG 
CTAGTGCCAT 
GACAGAGGCA 
CATACAGGCA 
ATTTAATTTT 
ATGCTGAATG 
AGATTGTGTG 
GTGGGCTGGG 
GCAGCAGCAT 
CGGGACCAGA 
TGCTACTGGG 
GTAATGCTGT 
TGGAACAAAG 
TCAAAAGAAC 
CCTATATAAT 



ACACATATAG 
CTAATTCATT 
AAGGATAGGT 
ACGGTCTGGC 
CAACATCTAG 
TTGTCTAAGT 
AGTTCTCTGA 
AGAGTGGTTG 
ACATTAATGT 
CCCATCCCGT 
TGTGCGTTTT 
CAGTGGCAGA 
TTAAATTCTG 
CTCGTCTCAG 
TTGTGCATTC 
TAAACAGACT 
GTGTCTAGAT 
GGAAACCACC 
TTTCAGTGGG 



TCCTTAAGAT 
TAAACCCCTC 
AGGCTGGGCG 
TCCTGAAGCA 
AATAAATTAC 
AACAAGCTAC 
ACTCCTTGCA 
CAGATATTAC 
TTTTCCCTAC 
AAATGAAAAA 
TGTTTTTTTT 
AAACCTCATG 
GGAGGGCTTG 
GCCAGTTGCA 
AGCTGAATTT 
TAAATTTTCT 
ATTTTGAATG 
GATGCTAATC 
GTACTGTGCA 
ATTACAATTC
ACGCAGAGCA 
AAAAACCCCA 
ATTTGCCCAA 
GCCCTCTCAA 
CATTCTTCTA 
TGCATACTCG 
GGCTTGAACA 
CTTTATGTTA 
TTTCTCCCTT 
GCTAGTTAAT 
TGTTTTAACC 
ACACAATCTC 
GTTGTCAGCA 
GCCTTCTCAG 
CATGGGGAAG 
AGCCTTTTTA 
CCAATCAAAT 
AGAAAATAGT 
GG 

Figure 1. Sequence of the 1332 bp sample used in AFM experiments. 



of $m$ units, located at $j$ units from one of the ends, as:

$$C_{j,m} = \sum_{i=j+1}^{j+m-1} \theta_i \qquad (1)$$

with $j, m = 1, 2, \ldots, N$ (Figure 2a).
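
As a minimal illustration of Equation (1), the sketch below computes signed bending angles from a traced xy chain and sums them over a segment. The function names (`bending_angles`, `segment_curvature`) are hypothetical and are not taken from the authors' LabView software; positive angles are assigned to clockwise turns, following the sign convention stated in the AFM imaging and analysis subsection.

```python
import math

def bending_angles(points):
    """Signed angles between consecutive tangent vectors of an xy chain.

    Positive values correspond to clockwise (right-hand) turns.
    """
    tangents = [(x1 - x0, y1 - y0)
                for (x0, y0), (x1, y1) in zip(points, points[1:])]
    angles = []
    for (ax, ay), (bx, by) in zip(tangents, tangents[1:]):
        cross = ax * by - ay * bx   # z-component of the vector product
        dot = ax * bx + ay * by
        angles.append(-math.atan2(cross, dot))  # minus sign: clockwise > 0
    return angles

def segment_curvature(theta, j, m):
    """C_{j,m}: sum of theta_i for i = j+1 .. j+m-1 (1-based i, as in Eq. 1)."""
    # theta[0] stores theta_1, so theta_i is theta[i - 1].
    return sum(theta[i - 1] for i in range(j + 1, j + m))

# Toy contour: a straight stretch followed by two right-hand turns.
contour = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (5.5, -1.3), (6.0, -3.2)]
theta = bending_angles(contour)
print(segment_curvature(theta, j=1, m=3))
```

With the summation limits of Equation (1) as printed, a segment of $m$ units collects $m-1$ bending angles; the sketch follows that convention literally.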

It is well known that the representation of DNA molecules
by segmental chains allows experimental findings to be
compared with predictions of polymer chain statistics,
particularly with those of the worm-like chain (WLC)
model in its discrete formulation (1-3,18,19,23,24). In
this context, DNA is modeled by a chain of virtual
bonds of length $l$ connected by torsional-spring vertices
that are energetically uncorrelated and characterized
by a harmonic local bending-energy function
$E(\theta_i^{th}) = \frac{1}{2} k_B T (\xi/l) (\theta_i^{th})^2$ (with $k_B$ the Boltzmann constant,
$T$ the absolute temperature, $\xi \approx 50$ nm the persistence length and
$\theta_i^{th}$ the thermally induced angular fluctuations occurring
around the constant sequence-dependent angles $\theta_i^0$) (1).
This corresponds to representing $\theta_i$ as the sum of static and
dynamic contributions, i.e. $\theta_i = \theta_i^0 + \theta_i^{th}$, where the $\theta_i^{th}$ angles
are normally distributed with null mean value and
standard deviation $\sqrt{l/\xi}$. Thus the WLC model predicts
that the average value of the $C_{j,m}$ curvature is:

$$\langle C_{j,m} \rangle = \sum_{i=j+1}^{j+m-1} \langle \theta_i \rangle = \sum_{i=j+1}^{j+m-1} \theta_i^0 \qquad (2)$$



where the angle brackets $\langle\,\rangle$ denote an ensemble average
conducted over the accessible chain conformations.
Equation (2) proves that the average curvature $\langle C_{j,m} \rangle$
equals the intrinsic curvature $\sum_i \theta_i^0$ of the segment.
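
Equation (2) can be checked numerically with a small Monte Carlo sketch. The code below assumes the discrete WLC picture described above (independent Gaussian fluctuations with standard deviation $\sqrt{l/\xi}$ added to a static angle profile); the function name, the toy intrinsic profile and the default values $l = 2$ nm and $\xi = 50$ nm are illustrative assumptions, not data from this study.

```python
import math
import random

def simulate_mean_curvature(theta0, j, m, l=2.0, xi=50.0,
                            n_chains=20000, seed=0):
    """Ensemble average of C_{j,m} over simulated, correctly aligned chains."""
    rng = random.Random(seed)
    sigma = math.sqrt(l / xi)   # thermal standard deviation of each angle
    # 0-based positions of theta_{j+1} .. theta_{j+m-1}
    idx = range(j, j + m - 1)
    total = 0.0
    for _ in range(n_chains):
        total += sum(theta0[i] + rng.gauss(0.0, sigma) for i in idx)
    return total / n_chains

# Hypothetical intrinsic profile: one bent tract of ten angles of 0.05 rad.
theta0 = [0.0] * 40
for i in range(10, 20):
    theta0[i] = 0.05

intrinsic = sum(theta0[i] for i in range(10, 20))
estimate = simulate_mean_curvature(theta0, j=10, m=11)
print(intrinsic, estimate)
```

The thermal contribution averages out only because every simulated chain is read in the same orientation, which is exactly the alignment requirement discussed in the text that follows.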

Furthermore, it suggests a route for comparing the experimental
values of intrinsic curvature with the theoretical
ones: in fact the left-hand term might be experimentally
accessed by averaging the $C_{j,m}$ realizations over a large
pool of AFM-imaged molecular contours, whereas the
right-hand term should be predicted computationally
by well-consolidated methods [e.g. refs. (2-4,18,25)].
Unfortunately, the practical estimation of $\langle C_{j,m} \rangle$ is a
non-trivial task since it requires orienting the sampled
molecular contours in order to evaluate the curvature
average on corresponding points of the nucleotide
sequence. In general, for each molecular contour extracted
from an AFM image there are four possible spatial
orientations, depending on which of the two contour ends
corresponds to the starting point of the base-pair sequence
(the 5'→3' direction) and which of the two chemically
different faces is exposed by the molecule to the substrate
when collapsing on it from the bulk solution (Figure 2b).
For the case of unlabeled chains, the orientation
uncertainty cannot be solved deterministically because there
are no distinctive topographical features that allow one to
distinguish the beginning from the end of a DNA molecule
in an AFM image. Nevertheless, a non-deterministic
approach can be carried out, based on the assumption
that the recorded contours share the same orientation

Figure 2. (a) Main steps of curvature analysis for a DNA chain. A molecule is imaged by AFM, then traced by image-analysis software and
represented as a chain of xy pairs separated by a contour length $l$. The signed bending angle $\theta_i$ is obtained from the vector product of the local
tangent vectors $t_i$ and $t_{i+1}$. (b) Generally speaking, we can ascribe four different spatial orientations to the extracted contour of a label-free molecule,
according to the end chosen as the starting point of the nucleotide sequence (red dot) and the molecular face exposed to the substrate. As a result, the
signed curvature $C_{j,m}$ changes in modulus and/or sign according to the chosen orientation. On the contrary, the CP $P_{j,m}$ is estimated by coupling
the signed curvatures of two $m$-units long segments, symmetrically placed at $j$ units from chain ends. Such quantity remains the same for each one of
the possible orientations of the extracted contour. (c) The characteristic patterns of variation of $\langle P_{s,L} \rangle$ for some specimens, here named (1-3),
make it possible to highlight DNA regions with different intrinsic curvature (gray box). This represents an original strategy to establish the comparative
analysis of bent duplexes under label-free conditions.
when we observe the minimal value of the ensemble
averaged curvature variance at each point along the mo- 
lecular trajectory; in this case automated computational 
algorithms are used to iteratively flip the orientation of 
the extracted molecular profiles in search for the 
minimum value of the overall curvature variance (19). 
An extensive discussion on this protocol is reported below. 

Whenever the imaged chains are end-labeled with a 
structurally distinctive tag, the beginning and the end of 
the nucleotide sequence (i.e. the chain polarity) are easily 
inferred from AFM images; alternatively palindromic 
dimers can be used. However, an uncertainty remains on 
the two orientations with mirror curvature profiles that 
describe DNA adsorption on chemically different faces. 
Scipioni et al. (20) and Sampaolese et al. (21) demonstrated
that such orientations are not statistically equivalent
if palindromes are deposited onto freshly cleaved
mica, because a preferential adsorption of T-rich faces
occurs. This unexpected phenomenon ultimately justifies
the calculation of $\langle C_{j,m} \rangle$ from an ensemble of palindromic
dimers, or even labeled chains properly orientated to have
the same polarity.

The alternative method described in this article 
characterizes the DNA intrinsic curvature through the 
calculation of a new statistical quantity, which couples 
two segments symmetrically placed along the sampled 
contours. In detail, we focus on the product $P_{j,m}$
between the curvatures $C_{j,m}$ and $C_{N-1-(j+m),m}$ of two
tracts formed by $m$ units and located at $j$ units from
chain ends (Figure 2b):

$$P_{j,m} = C_{j,m} \cdot C_{N-1-(j+m),m} \qquad (3)$$

For the case of non-overlapping fragments
($j = 1, 2, \ldots, N/2$; $m = 1, \ldots, N/2 - j$), the quantity $P_{j,m}$
is normally distributed with average value:

$$\langle P_{j,m} \rangle = \langle C_{j,m} \cdot C_{N-1-(j+m),m} \rangle = \langle C_{j,m} \rangle \cdot \langle C_{N-1-(j+m),m} \rangle \qquad (4)$$

where $\langle C_{j,m} \rangle$ is given by Equation (2) and the last equality
holds under the specific conditions of the present
investigation (see Supplementary Equation S4 and the related
discussion in the Supplementary Data). Equation (4)
shows that $\langle P_{j,m} \rangle$ equals the product of the intrinsic
curvatures of the two chosen segments. It is trivial to
demonstrate that the realizations of the statistical chain
descriptor $P_{j,m}$ do not depend on the orientation
arbitrarily assigned to the DNA trajectories extracted from AFM
images; in other words, $P_{j,m}$ is centrosymmetric.
Accordingly, $\langle P_{j,m} \rangle$ can be estimated by an ensemble
average of $C_{j,m} \cdot C_{N-1-(j+m),m}$ values obtained from a
large pool of molecular profiles with arbitrary relative
orientation. In agreement with this picture, neither
end-labeled molecules nor palindromic constructs are
required.
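
The orientation invariance of the curvature product can be illustrated with a short sketch. It assumes a simplified 0-based indexing in which the two segments are exact mirror images about the chain centre (this may differ by one unit from the indices of Equation (3)), and it represents the four orientations of a profile as the combinations of sign inversion and order reversal. All names are hypothetical.

```python
def curvature_product(theta, j, m):
    """Product of the curvatures of two m-angle segments placed
    symmetrically at j angles from each end of the profile."""
    n = len(theta)
    if j < 0 or m < 1 or 2 * (j + m) > n:
        raise ValueError("segments must be non-overlapping and inside the chain")
    left = sum(theta[j:j + m])
    right = sum(theta[n - j - m:n - j])
    return left * right

def orientations(theta):
    """The four readings of one contour: as traced, sign-inverted,
    order-reversed, and both operations combined."""
    rev = theta[::-1]
    return [theta, [-t for t in theta], rev, [-t for t in rev]]

profile = [0.30, -0.10, 0.20, 0.05, -0.40, 0.10, 0.25, -0.15]
for variant in orientations(profile):
    print(curvature_product(variant, j=1, m=2))
```

Sign inversion changes the sign of both factors, and order reversal simply swaps them, so the product is unchanged in every case, whereas each individual segment curvature is not.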

In particular, we propose to characterize the DNA intrinsic
curvature by introducing the curvilinear distance
$s = jl$ and by plotting $s$ versus $\langle P_{s,L} \rangle$ at fixed $L$ ($L = ml$),
which corresponds to probing the emergence of intrinsic
curvature effects for pairs of segments of fixed length $L$,
located at a given distance $s$ from the ends. By definition,
we expect to observe remarkable variations of $\langle P_{s,L} \rangle$
whenever large intrinsic curvatures affect the trajectory
of the chosen fragments, whereas $\langle P_{s,L} \rangle \approx 0$ for pairs
with negligible intrinsic curvature in at least one of the
two involved segments. Overall, these features contribute
to generate a characteristic pattern of variation of $\langle P_{s,L} \rangle$
that can be exploited to set up the comparative
analysis of bent duplexes (Figure 2c).
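
A sketch of how such a pattern could be assembled from an ensemble of randomly oriented profiles is given below. The simulated chains, the toy intrinsic profile with two symmetric bent tracts, and the names `cp_map` and `simulate_profiles` are assumptions made for illustration only; the indexing is 0-based with exactly mirror-symmetric segments.

```python
import math
import random

def cp_map(profiles, m, l=2.0):
    """Return [(s, <P_{s,L}>)] with s = j*l and L = m*l for all
    non-overlapping symmetric pairs of m-angle segments."""
    n = len(profiles[0])
    pattern = []
    for j in range(0, n // 2 - m + 1):
        acc = 0.0
        for th in profiles:
            acc += sum(th[j:j + m]) * sum(th[n - j - m:n - j])
        pattern.append((j * l, acc / len(profiles)))
    return pattern

def simulate_profiles(theta0, n_chains, l=2.0, xi=50.0, seed=1):
    """Thermally perturbed copies of theta0, each in a random orientation."""
    rng = random.Random(seed)
    sigma = math.sqrt(l / xi)
    profiles = []
    for _ in range(n_chains):
        th = [t0 + rng.gauss(0.0, sigma) for t0 in theta0]
        if rng.random() < 0.5:
            th = th[::-1]               # unknown chain polarity
        if rng.random() < 0.5:
            th = [-t for t in th]       # unknown adsorbed face
        profiles.append(th)
    return profiles

# Toy chain: tracts of +0.5 rad and -0.5 rad placed symmetrically.
theta0 = [0.0] * 60
for i in range(5, 15):
    theta0[i] = 0.05
for i in range(45, 55):
    theta0[i] = -0.05

pattern = cp_map(simulate_profiles(theta0, n_chains=5000), m=10)
for s, p in pattern[::5]:
    print(f"s = {s:5.1f} nm   <P> = {p:+.3f} rad^2")
```

In this toy case the averaged product is expected to approach the product of the two intrinsic curvatures where both tracts are bent and to stay close to zero where at least one segment is straight, even though no profile was aligned beforehand.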

The reader is referred to the Supplementary Data to 
get further insight into the generality and flexibility of 
the present method. As a matter of fact, a whole class 
of new centrosymmetric curvature descriptors can be 
introduced to probe bent chains under label-free conditions.
For the case of $P_{s,L}$, we anticipate that $s$ versus
$\langle P_{s,L} \rangle$ patterns are fully comparable, in terms of
accuracy and sensitivity, with the maps of signed intrinsic
curvature provided by the current experimental strategies.
Moreover, simple solutions exist to circumvent
the decrease of sensitivity that affects the sites where
$\langle P_{s,L} \rangle \approx 0$, e.g. causing a lateral shift of the centre of
symmetry of the target molecules through ad hoc
deletions of short fragments at one of the chain ends.
Complementary patterns from two descriptors can
also be exploited to bypass such a drawback.

We finally note that there is certainly a loss of information
in the $s$ versus $\langle P_{s,L} \rangle$ plot with respect to the $s$ versus
$\langle C_{s,L} \rangle$ data (2-4,19-21). This loss unavoidably arises from
coupling pairs of DNA tracts into the definition of the
statistical chain descriptor $P_{j,m}$. Nevertheless, we argue
hereafter that our choice readily provides a number of
important advantages, overcoming some fundamental
and practical limitations of earlier protocols. In fact the
novel method can be easily implemented on label-free
molecules, therefore specimen preparation is
reduced to standard protocols for DNA deposition onto
atomically smooth substrates. Furthermore, no assumptions
are made on the adsorption mechanism and preferential
orientation of target chains on a given substrate
[otherwise required to calculate the conformational
average $\langle C_{s,L} \rangle$ (3,20,21)]. The $s$ versus $\langle P_{s,L} \rangle$ plots are also
amenable to an effective comparison with theoretical models
[used to predict the right-hand term of Equation (4)] that
give access to the physics of DNA adsorption and
sequence-dependent curvature. All these aspects are
deeply explored in the following paragraphs, starting 
from the characterization of the intrinsic curvature of 
DNA molecules from the promoter region of the human 
OPN coding gene. 

AFM imaging and analysis 

Samples were imaged in air at room temperature and hu- 
midity with a Dimension 3100 AFM equipped with the 
closed-loop Hybrid XYZ scanner and the Nanoscope 
IVa control unit (Digital Instruments, Veeco). The AFM 
was operated in tapping mode and silicon probes 
(OMCL-AC160TS, Olympus) were used. The AFM
images were collected with a dimension of 1024 × 1024
pixels and a typical scan size of 2 µm.

Our image-analysis software allowed a semi-automatic
reconstruction of molecular trajectories and a straightforward
analysis of the signed curvature associated with
segments of given location and length. The tracing algorithm
was developed in LabView (National Instruments)
following the general guidelines of ref. (24) and molecules
were represented as chains of xy pairs separated by a
contour length $l = 2$ nm (see Supplementary Data for
further details). The positive values for the signed
bending angles $\theta_i$ were arbitrarily assigned to clockwise
rotations, i.e. if by progressing along the trajectory the
chain turns to the right at $\theta_i$. The signed curvatures $C_{j,m}$
were estimated from Equation (1), whereas the average
product $\langle P_{j,m} \rangle$ was evaluated from the conformational
average of the $C_{j,m} \cdot C_{N-1-(j+m),m}$ product over a given
set of AFM-imaged molecular profiles.

We also implemented standard checks on global statis- 
tical parameters to ascertain the thermodynamic equili- 
bration of the molecules during the deposition process 
onto the mica surface and to investigate the influence of 
intrinsic curvature on the average superstructure of the 
chain. According to the WLC model, the mean trajectory
of an intrinsically straight chain ($\theta_i^0 = 0$ at every $i$ location)
is given by the following equation:

$$\langle R^2_{s,s+L} \rangle = 4\xi \left[ L + 2\xi \left( e^{-L/2\xi} - 1 \right) \right] \qquad (5)$$

where $R_{s,s+L}$ is the Euclidean distance between pairs of
points located at $s$ and $s+L$ from one end of the
molecule (here $L$ increases up to the limit of the chain
length) (23,24). The average $\langle\,\rangle$ is computed over $s$ (up
to the upper limit of the contour length of the molecule
minus the contour length spacing $L$) and over all observed
contours. We measured $\langle R^2_{s,s+L} \rangle$ on a large ensemble of
traced contours and discussed deviations of such
quantity from predictions of Equation (5).
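
The equilibration check can be sketched as follows: one function evaluates the two-dimensional WLC prediction of Equation (5), and a second one estimates the same quantity from traced contours at a separation of `k` chain units. Both function names are hypothetical, and the default persistence length of 50 nm is only the nominal value quoted above.

```python
import math

def wlc_msd_2d(L, xi=50.0):
    """Equation (5): mean-squared distance of an intrinsically straight,
    equilibrated 2D worm-like chain (L and xi in the same length units)."""
    return 4.0 * xi * (L + 2.0 * xi * (math.exp(-L / (2.0 * xi)) - 1.0))

def mean_squared_distance(contours, k):
    """Average squared distance between all point pairs separated by k
    units, pooled over s and over every contour (lists of (x, y))."""
    total, count = 0.0, 0
    for pts in contours:
        for (x0, y0), (x1, y1) in zip(pts, pts[k:]):
            total += (x1 - x0) ** 2 + (y1 - y0) ** 2
            count += 1
    return total / count

for L in (10.0, 100.0, 400.0):
    print(L, wlc_msd_2d(L))
```

Two limits give a quick sanity check of the formula: for $L \ll \xi$ the chain behaves as a rigid rod and the value tends to $L^2$, while for $L \gg \xi$ it approaches $4\xi L - 8\xi^2$.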

The automated fragment flipping algorithm 

We compared the curvature analysis based on $s$ versus
$\langle P_{s,L} \rangle$ plots with the predictions of the first label-free
method for the automated intrinsic curvature computation,
originally proposed by Ficarra et al. (19). In this
case, the angles $\theta_i$ extracted from AFM images are
arranged into a curvature matrix such that each row
represents the curvature profile of a given molecule. Since
curvature profiles are loaded without any knowledge of
the molecules' relative orientation, an automated algorithm
iteratively 'flips' each row (by inverting the sign, reversing
the order, combining both operations or leaving the row
unchanged) in search of the optimal matrix configuration,
defined as the condition in which we can observe
the minimal values for column variances; this is in fact
recognized to represent the case in which all the molecules
share the same orientation. Once the mean value of
the column variances achieves a minimum, the column
averages are expected to provide the intrinsic DNA curvature
profile, in agreement with the WLC model stating
$\langle \theta_i \rangle = \langle \theta_i^0 + \theta_i^{th} \rangle = \theta_i^0$.
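
The row-flipping idea can be sketched with a simple greedy search. This is a simplified illustration and not the implementation of ref. (19): it sweeps over the rows, tries the four variants of each one, and keeps a variant only when it lowers the mean column variance. All names and the stopping rule are assumptions.

```python
def variants(row):
    """Unchanged, sign-inverted, order-reversed, and both combined."""
    rev = row[::-1]
    return [row, [-x for x in row], rev, [-x for x in rev]]

def mean_column_variance(matrix):
    n_rows, n_cols = len(matrix), len(matrix[0])
    total = 0.0
    for c in range(n_cols):
        col = [matrix[r][c] for r in range(n_rows)]
        mu = sum(col) / n_rows
        total += sum((x - mu) ** 2 for x in col) / n_rows
    return total / n_cols

def fragment_flip(matrix, max_sweeps=20):
    """Greedy sketch: flip one row at a time while the score decreases."""
    rows = [list(r) for r in matrix]
    best = mean_column_variance(rows)
    for _ in range(max_sweeps):
        improved = False
        for r in range(len(rows)):
            keep = rows[r]
            for cand in variants(keep)[1:]:
                rows[r] = cand
                score = mean_column_variance(rows)
                if score < best - 1e-12:
                    best, keep, improved = score, cand, True
            rows[r] = keep              # retain the best variant found
        if not improved:
            break
    return rows, best

base = [0.10, 0.30, -0.20, 0.05]
mixed = [base, [-x for x in base], base[::-1], base]
aligned, score = fragment_flip(mixed)
print(aligned[0], score)
```

A greedy search of this kind can stop in a local minimum on noisy data, and the recovered profile is defined only up to a global sign inversion and reversal, since flipping every row together leaves the column variances unchanged.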

Initially, a custom-written code (LabView, National 
Instruments) implementing the automated fragment 
flipping (FF) algorithm was validated on a set of 
computer-generated and intrinsically bent chains, as ex- 
plained in ref. (19) (see also the Supplementary Data). 



The same FF code was then used to reconstruct the in- 
trinsic curvature profile and the CP patterns for the 
human OPN coding gene, starting from an ensemble of 
molecular profiles extracted from AFM images. 

Modeling DNA intrinsic curvature and adsorption 

Model chains representing the average three-dimensional 
(3D) shape of DNA specimens were generated by the 
3DNA software (26) exploiting nearest-neighbor, static 
dinucleotide wedge models (27,28). 

We custom developed an algorithm (LabView, National 
Instruments) that flattens the 3D model chain to simulate 
deposition. Briefly, it divides the chain into a discrete 
number of fragments originally lying on different planes 
and projects them individually. The output is a 2D chain 
formed by the geometric projections connected at their 
ends according to local continuity criteria. This procedure 
assumes that the 3D → 2D transformation takes place at
the expense of few local twists of the molecular backbone; 
as a consequence it reasonably implies a minimum 
increase of the conformational energy of the flattened 
molecule with respect to the 3D counterpart. 

The algorithm was implemented as follows. Geometric 
projection starts at one of the 3D chain ends and involves 
the longest fragment that can be projected onto a best fit 
plane while maintaining its overall fluctuations (relative to 
that plane) below a given threshold. Once such fragment is 
found, the algorithm is iterated on the remaining part of 
the 3D chain until the whole curve is flattened onto a unique
set of preferential planes. The threshold value is chosen to
match the typical range of chain-surface interaction
forces, i.e. a few nanometres.

The results of the above algorithm for the target 
DNA were found to be consistent with those obtained 
by a different theoretical approach, originally proposed 
by Scipioni et al. (20). 



RESULTS AND DISCUSSION 

Characterization of local intrinsic curvature for the 
human OPN coding gene 

After sample preparation, a quantitative AFM analysis
of molecular profiles was routinely performed in order to
test the reproducibility of imaging conditions, evaluate
relevant deviations of adsorbed DNA superstructure
from the canonical B-form and get deeper insight into the
influence of intrinsic curvatures on the local and global
geometrical properties of the traced contours. Typically,
measured DNA molecules displayed an average width of
~10 nm and a height of 0.8-1.0 nm, due respectively to
AFM probe convolution effects and to the elastic deformation
of the soft molecule under the repulsive forces
exerted by the scanning tip (29). The surface density of
molecules was in the range 2-5 µm$^{-2}$. The analysis of the
contour lengths for a large number of traced molecules
(~400) attested a DNA contraction of 5% with respect
to the B-form. This corresponds to a helix rise per base
pair of 0.32 nm, in excellent agreement with results of
similar studies (1,8,10,23,30,31). In Figure 3a we report
a representative high resolution topography of the target 
Figure 3. (a) Representative AFM topography of the target DNA.
It shows the persistence of bends at few locations along the molecular
backbone (marked by arrows), suggesting the presence of a significant
intrinsic curvature at the same places. In the inset is the histogram of
contour lengths. (b) Comparison of the experimentally measured
end-to-end distance curve with the WLC model predictions for linear
(red) and bent (black) chains. The chosen specimen reveals a small but
systematic decrease of $\langle R^2_{s,s+L} \rangle$ at curvilinear distances above 250 nm,
ascribed to an overall coiling of the chains with respect to linear DNA
of comparable length. The WLC simulations on bent chains are in
excellent agreement with experimental data at all curvilinear distances,
confirming the key role played by intrinsic curvature.



DNA. As expected, it reveals the large variety of shapes 
assumed by DNA under the thermal stochastic perturb- 
ation of its molecular environment. By visual inspection 
however, one can already notice the persistence of bends 
at a few sites, namely in close proximity of both ends and 
within the central region of the chain. This fact suggests 
the presence of non-null intrinsic curvatures at the same 
places. 

The first quantitative evidence for the influence of local
intrinsic curvatures on the global conformation of the
chains emerged when we measured the mean-squared
end-to-end distance $\langle R^2_{s,s+L} \rangle$ for an ensemble of 160
molecular profiles extracted from several AFM topographies.
As shown in Figure 3b, $\langle R^2_{s,s+L} \rangle$ shows a good agreement
with the WLC model for $L < 200$ nm, as attested by the
Figure 4. Characteristic patterns of variation of the average CP for the
human OPN coding gene, estimated for three different L values. Circles 
indicate the main positive and negative peaks along each profile: they 
represent pairs of L-long segments with average curvature oriented 
respectively in the same or in the opposite direction. 



effective interpolation of experimental data with Equation
(5) in that range; in particular we estimated $\xi = 52$ nm, which
agrees with the DNA flexibility reported by other AFM
experiments (8,23,24) and proves the thermodynamic
equilibration of chains on mica for the investigated
samples. Notably, a small but systematic decrease of
$\langle R^2_{s,s+L} \rangle$ with respect to the WLC predictions for linear
chains takes place for $L > 250$ nm. A similar behavior
has been reported by Rivetti et al. (1) for chains with
in-phase A-tracts and by Moreno-Herrero et al. (8) for
strands with hyperperiodic sequences, and can be considered
a signature of the presence of intrinsic curvatures
that force DNA to assume (on average) a more
compact coil structure compared with linear DNA of the
same length. We confirmed the correctness of that picture
a posteriori, by comparing experimental data with WLC
predictions for bent chains; the intrinsic curvature profile
of the ensemble was evaluated through the use of the
wedge model of De Santis et al. (28) (see also next paragraph).
These arguments led us to rule out a substantial
impact of excluded volume effects on the measured
$\langle R^2_{s,s+L} \rangle$ (23).

Corroborated by such findings, we performed a refined 
characterization of the intrinsic curvature along the DNA 
contour by implementing the novel method; in particular, 
we explored three different contour lengths L = 17 nm 
(50 bp), 34 nm (100 bp) and 51 nm (150 bp) over the same 
ensemble of 160 profiles. The obtained patterns of variation
are contrasted in Figure 4. We observe clear oscillations
of the s versus ⟨P_{s,L}⟩ curves for each L value that confirm
once more the presence of intrinsic curvatures along the
studied contours and can be used to locate the most
significant bending sites of the molecular backbone.
According to Equation (4), the main positive and
negative peaks of ⟨P_{s,L}⟩ mark the curvilinear positions of
symmetric pairs of segments with the largest intrinsic
curvature. In particular, we recognize three main peaks
of 0.05–0.1 rad² for s < 70 nm that concern pairs of
segments close to the contour ends, and a large
negative peak of 0.1–0.2 rad² for s ≈ 150–175 nm,
which on the contrary regards pairs of tracts located
around the middle portion of the strands. In the range
s ≈ 70–130 nm, the s versus ⟨P_{s,L}⟩ curves are almost
completely flat and ⟨P_{s,L}⟩ ≈ 0, which means that at least
one of the two symmetrically placed segments presents a
negligible intrinsic curvature. Notably, the curvilinear
positions of the main peaks of ⟨P_{s,L}⟩ in Figure 4 are in good
qualitative agreement with the visual inspection of DNA
bends from several AFM topographies (see e.g. Figure
3a).
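
As a rough illustration of how such a pattern can be computed from traced contours, the sketch below assumes that the curvature of an L-long segment is the signed net bend angle accumulated over the window, and that the CP pairs the window starting at s with its mirror image about the chain centre. The helper names (`bend_angles`, `cp_pattern`) and the uniform sampling of the traces are assumptions made for the example, not the authors' implementation.

```python
import numpy as np


def bend_angles(xy):
    """Signed turning angles (rad) between consecutive segments of a 2D trace."""
    d = np.diff(np.asarray(xy, dtype=float), axis=0)
    heading = np.unwrap(np.arctan2(d[:, 1], d[:, 0]))
    return np.diff(heading)


def cp_pattern(angle_sets, window):
    """Ensemble-averaged product of curvatures for mirror-symmetric windows.

    angle_sets : sequence of 1D arrays of equal length n (one per molecule)
    window     : number of consecutive bend angles in each segment
    Returns an array indexed by the window start k = 0 .. n - window.
    """
    theta = np.asarray(angle_sets, dtype=float)          # (molecules, n)
    n = theta.shape[1]
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and the number of angles")
    zeros = np.zeros((theta.shape[0], 1))
    csum = np.concatenate([zeros, np.cumsum(theta, axis=1)], axis=1)
    curv = csum[:, window:] - csum[:, :-window]          # net bend per window
    product = curv * curv[:, ::-1]                       # pair with mirror window
    return product.mean(axis=0)                          # ensemble average
```

With this pairing, reading any individual trace from the opposite end leaves its contribution unchanged, which is why molecules with unknown orientation can be averaged together without prior alignment.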

We show in the next paragraphs that the patterns of 
variation of Figure 4 depend on the nucleotide sequence 
and the adsorption mechanisms. For simplicity, we focus
on the contour lengths L = 17 nm and 34 nm that provide
patterns of variation free from AFM tip convolution
artifacts; in fact these are expected to affect the experimental
CP values whenever L becomes comparable with (or
smaller than) the DNA apparent width (~10 nm, see
Supplementary Data). The substantial lack of novel
features in the CP pattern for L = 51 nm further justifies
our interest in the shorter contour lengths.

Patterns of variation of the average product of curvatures: 
experiment versus theory 

A theoretical model suitable for the interpolation of the 
experimental results of Figure 4 should in principle 
account for the sequence-dependent static curvature of 
DNA and chain dynamics during adsorption and the sub- 
sequent surface relaxation, and should as well provide in- 
dications on the most important parameters governing the 
reorganization of superstructure under realistic experimental
conditions. This is certainly a complex task since
the long-range van der Waals forces and the short-range
double-layer ones that control the adsorption process,
apart from inducing an adjustment of DNA segment
positions in order to adopt the equilibrium distance
from the surface, can also tune the appearance of
out-of-equilibrium, long-lived alterations of chain
architecture, including kinks, over(under)twists, local B to A
transitions and even melting (6,31). As a result, adsorption
can dramatically affect the standard chain geometry and 
statistics, as already demonstrated by a number of works 
based on Monte-Carlo (MC) and molecular dynamics 
simulations (24,32-34). 

Whereas the implementation of a comprehensive model
for the adsorption of an intrinsically curved DNA is beyond
the scope of the present article, we note that the need for
a straightforward comparison of electron microscopy and
AFM data with theoretical models often leads to the
practice of treating the average 3D shape of DNA by
means of nearest-neighbor, static dinucleotide wedge
models and reducing adsorption to a simple geometric
projection of the 3D trajectory onto one or more preferential
planes (3,4,18). This solution is of course prone to errors,
only partially mitigated by taking into account bent and
approximately planar DNA molecules (4). Nevertheless, it
can be considered a first-order approximation to the
analysis of the intrinsic curvature profile of any 3D
chain geometry. For such reasons the same solution was
adopted in the present case. In doing this we recognize
that the obtained theoretical framework provides an
oversimplified picture of the real DNA structure and
dynamics, as recently confirmed by the results of extensive
molecular dynamics simulations [e.g. ref. (35) and
references therein]. On the other hand, experimental results
(1-6,8,18-21) demonstrated that modeling adsorption in
terms of projections onto best-fit planes results in a final
DNA configuration that satisfactorily approximates the
actual, average conformation of equilibrated adsorbed
chains.

Figure 5. (a) Representation of the average shape of the human OPN
coding gene according to the static dinucleotide wedge model of De
Santis et al. (b) 2D trajectory of the DNA obtained by flattening the
3D model of (a) and twisting a portion of the backbone region (see also
Supplementary Data).

The 3D intrinsic structure of the target DNA was
investigated by means of the wedge models of
De Santis et al. (28) and Bolshoy et al. (27), which have
already shown good agreement with the DNA intrinsic
curvature data accessed by AFM imaging (3,4). In these
models, the local static curvature is computed by summation
of the differential deviation angles of the helix axis at
individual dinucleotide steps, and the average shape and
intrinsic curvature profile of DNA in bulk solution are
readily obtained from mere knowledge of the whole
nucleotide sequence.

Visual inspection of the superstructure predicted by the 
model of De Santis et al. for the human OPN coding gene 
reveals the presence of local bends that extend over several 
helix turns and clearly impart a 3D shape to the studied 
strand (Figure 5a). 

The result of the 3D → 2D transformation mimicking
deposition is shown in Figure 5b (see also Supplementary
Data for details). The 2D chain closely resembles
several AFM-imaged molecules, as can be seen by
comparing Figures 5b and 3a.

The 2D trajectory of Figure 5b was used to simulate the
room temperature bending of DNA, describing chain
lateral motion onto the mica surface. To this purpose, it
was sampled at the spacing l_WLC = 0.32 nm (corresponding
to the experimentally found helix rise per base-pair)
and thermal effects (on bending) were implemented
by adding to the angles among neighbor segments a
fluctuation chosen by a MC method from normally
distributed numbers with mean zero and variance
l_WLC/ξ (ξ = 52 nm). The new trajectories were
superimposed on a randomly flat substrate (roughness
0.1 nm) and dilated by a parabolic tip (36) in order to
generate topographies resembling as closely as possible
those obtained by AFM (Figure 6a). These were finally
analyzed with the tracing algorithm in order to ensure a
bias (due to random and systematic angular distortions)
comparable to that affecting experimental data. In Figure
6b, we report the obtained results for the two sliding
windows of size L = 17 nm and L = 34 nm, respectively.

Figure 6. (a) Representative conformations of six chains generated by MC methods from the intrinsic 2D trajectory predicted by the De Santis et al.
model for the target DNA. A randomly flat substrate has been intentionally added to generate topographies resembling as closely as possible those
obtained by AFM. (b) Theoretical pattern of variation of the CP for two different values of the sliding window length L; encircled are the main peaks of
the plots. Experimental results are reported for comparison with L = 34 nm. (c) Intrinsic curvature of the 2D trajectory predicted by the De Santis et al.
model, with marked positions of the pairs of segments of length L = 17 nm (50 bp) related to the peaks highlighted in (b). Positions are shown for
clarity also on the 2D chain. (d) Comparison of experimental results with the theoretical pattern of variation of the CP predicted with the model of
Bolshoy et al.
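
The thermal-bending step of the simulation described above can be sketched as follows. This is a minimal example under stated assumptions: the intrinsic bend angles are taken as a given input array, the function name `thermal_chain` is hypothetical, and the substrate roughness, tip dilation and tracing steps are deliberately left out.

```python
import numpy as np


def thermal_chain(intrinsic_angles, spacing_nm=0.32, xi_nm=52.0, rng=None):
    """Add Gaussian bending fluctuations to a 2D intrinsic trajectory.

    intrinsic_angles : signed bend angles (rad) between neighbor segments
    spacing_nm       : segment length (0.32 nm in the text)
    xi_nm            : persistence length (52 nm in the text)
    Returns the perturbed angles and the (x, y) coordinates of the chain.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta0 = np.asarray(intrinsic_angles, dtype=float)
    sigma = np.sqrt(spacing_nm / xi_nm)          # variance = spacing / xi
    theta = theta0 + rng.normal(0.0, sigma, size=theta0.shape)
    # Integrate the angles into segment headings, then into positions.
    heading = np.concatenate([[0.0], np.cumsum(theta)])
    steps = spacing_nm * np.stack([np.cos(heading), np.sin(heading)], axis=1)
    xy = np.concatenate([[[0.0, 0.0]], np.cumsum(steps, axis=0)])
    return theta, xy
```

Calling the function repeatedly with different random generators yields an ensemble of conformations around the same intrinsic shape, which could then be passed through whatever imaging and tracing simulation is being used.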

The theoretical s versus ⟨P_{s,L}⟩ curves are characterized
by marked oscillations that persist at similar curvilinear
distances for both L values. In particular, for L = 17 nm
three negative peaks of ≈ 0.10–0.25 rad² (conventionally
named 1, 3 and 5) appear at s_1 = 25 nm, s_3 = 55 nm
and s_5 = 168 nm, whereas smaller local maxima
(named 2 and 4) occur at s_2 = 43 nm (≈ 0.07 rad²) and
s_4 = 95 nm (≈ 0.04 rad²), respectively. A similar trend is
found for L = 34 nm. Consistent with Equation (4), local
peaks of s versus ⟨P_{s,L}⟩ plots are related to pairs of
segments with large intrinsic curvature. This is confirmed
by a direct inspection of the static curvature profile of the
projected chain, reported in Figure 6c: here we
highlighted the L-long tracts involved in the calculation of
⟨P_{s,L}⟩ at the sites s_1, ..., s_5, demonstrating that each one
of them holds appreciable curvatures θ_j, whose magnitude
varies in the range 0.02–0.10 rad. Notably, the model can
trace each tract back to its base-pair content; therefore
the local peaks point out those parts of the primary
sequence that impart local and persistent nanoscale
curvatures to the adsorbed chains.

We explored the dependence of model predictions on
the chosen set of dinucleotide parameters by performing a
new data analysis based on the model of Bolshoy et al.
(27). The latter originates from a large body of curvature
data from circularization and gel electrophoresis mobility
experiments, whereas the De Santis et al. model is primarily
based on theoretical calculations of the minimum
energy structure of the DNA strand, successively refined
to improve the correlation with experimental results. The
theoretical pattern for L = 34 nm is reported in Figure 6d:
here we observe the same peaks of Figure 6b at similar
curvilinear positions but with an appreciable variation of
their magnitude. This confirms the largely accepted
opinion that one dinucleotide model is as good as
another in determining the structure and mechanical
properties of DNA in bulk solution (25) and supports
the conclusion that the curvilinear positions of the peaks
are consistently predicted by our analysis with negligible
dependence on the chosen model. On the contrary, the estimated
amplitude of the ⟨P_{s,L}⟩ peaks is appreciably affected by the
specific dinucleotide parameter set and by the flattening process
mimicking DNA adsorption. This suggests exploiting
the comparison of experimental and theoretical CP
patterns on DNA model systems in order to systematically
contrast the response of the several well-known DNA
bending models proposed so far (25).

The five peaks at s_1, ..., s_5 can be exploited to drive the
comparison between theory and experiment. Assuming
L = 34 nm for simplicity, we recognize peak 1 also in the
experimental data, biased by a small horizontal shift δs_1
of about 8 nm. The shifts for the remaining peaks are
negligible compared to the positional errors (<5 nm)
affecting the molecular trajectories extracted from the
tip-convoluted AFM images (Figure 6d). The protocol
adopted for sample preparation certainly contributes
to the observed discrepancy at s_1. In particular, the
horizontal shift δs_1 might be ascribed to a structural
reorganization of adsorbed DNA at one or both ends, involving
local variations of the helix rise, nanosized deletions or
out-of-equilibrium alterations that are not properly
resolved by AFM imaging and that can be induced by
sample drying (31). Moreover, the reduced magnitude of
the peaks in the experimental pattern with respect to the
theoretical counterpart (mostly at s_1 and s_5) may be
attributed to rinsing the samples with pure water
after DNA adsorption on mica: this step in fact reduces
the ionic strength of the solution and consequently
enhances the electrostatic repulsion of charged phosphate
groups. A net decrease of the absolute curvature of the
already adsorbed molecules is therefore highly probable
(20,21).

The overall satisfactory agreement shown in Figure 6d is
achieved also when contrasting data with predictions
based on the De Santis et al. model, but with slightly different
δs_i values (bottom panel of Figure 6b). In view of such
results, we recognize that our analysis effectively describes
the relevant features of the patterns of variation
introduced by the new method with a simple and sound
theoretical framework. We are able to predict the
curvilinear position and amplitude of the main local peaks in an
s versus ⟨P_{s,L}⟩ plot and, if necessary, find out those parts of
the primary sequence that impart a persistent bending to
the target DNA. Due to the consistent response offered by
several dinucleotide and trinucleotide bending models
(25), the curvilinear positions of the peaks are robust
against variations of the angular parameters; thus they
can be used to compare model predictions with experimental
data as well as to gain a deeper insight into the
physical processes characterizing DNA adsorption. A
quantitative measure of the amount of error between the
individual wedge models and experimental data is also
obtained by introducing the residual sum of squares (RSS).
We find RSS_DeSantis ≈ 2.3 rad² and RSS_Bolshoy ≈ 1.0 rad²
due to the better interpolation of the amplitude of
peaks 1 and 5 offered by the Bolshoy et al. model.
Nevertheless, the two models are comparable in the
restricted range 30 nm < s < 130 nm (peaks 2, 3 and 4)
where RSS_DeSantis ≈ RSS_Bolshoy ≈ 0.15 rad².
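
The RSS comparison above can be illustrated with a short sketch. It assumes that the theoretical pattern is linearly interpolated onto the experimental s grid before the squared residuals are summed, and that an optional window in s restricts the comparison; neither detail is stated in the text, and the function name and arguments are hypothetical.

```python
import numpy as np


def residual_sum_of_squares(s_exp, cp_exp, s_model, cp_model,
                            s_min=None, s_max=None):
    """Residual sum of squares between experimental and model CP patterns.

    The model curve is interpolated onto the experimental positions
    (s_model must be increasing); s_min / s_max optionally restrict
    the range of curvilinear positions included in the sum.
    """
    s_exp = np.asarray(s_exp, dtype=float)
    cp_exp = np.asarray(cp_exp, dtype=float)
    cp_interp = np.interp(s_exp, s_model, cp_model)
    mask = np.ones(s_exp.shape, dtype=bool)
    if s_min is not None:
        mask &= s_exp >= s_min
    if s_max is not None:
        mask &= s_exp <= s_max
    return float(np.sum((cp_exp[mask] - cp_interp[mask]) ** 2))
```

Restricting the range with `s_min` and `s_max` mirrors the comparison limited to 30 nm < s < 130 nm, where the end peaks no longer dominate the residuals.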

Comparison of the novel method with the automated 
FF algorithm 

The intrinsic curvature profile of the human OPN coding 
gene was reconstructed by applying the FF algorithm to 
an ensemble of 100 molecular profiles extracted from 
AFM images. Figure 7a contrasts the reconstructed θ⁰_s
profile with the De Santis et al. model predictions. The two
profiles show very comparable features, in particular two
regions of large curvature at s ≈ 50 nm and s ≈ 170 nm,
respectively (peaks 2 and 5), and a well-defined sequence
of smaller local peaks at similar curvilinear positions
(peaks 8-12).

The reconstructed θ⁰_s values were used to calculate ⟨P_{s,L}⟩
by means of Equations (2) and (4), and the result was
finally compared to that obtained with our method (which
gives a virtually exact ⟨P_{s,L}⟩ value in that it works directly
on the ensemble average of P_{s,L} realizations over
experimental trajectories). As shown in Figure 7b, there
is a reasonable agreement of the two data sets for
20 nm < s < 50 nm and s > 100 nm, whereas appreciable
discrepancies occur in the two regions 0 < s < 20 nm
and 50 nm < s < 100 nm. This fact demonstrates that the
FF algorithm fails to recover the whole intrinsic curvature
information displayed by the experimental CP pattern,
with the exception of the position and magnitude of the
two main peaks located, respectively, at s ≈ 30 nm and
s ≈ 160 nm.

Figure 7. (a) The intrinsic curvature profile estimated by the FF
algorithm is compared with the corresponding curve predicted by the
theoretical approach described in the previous subsection. (b) The CP pattern
computed from the intrinsic curvature profile in (a) is compared with
the CP profile directly estimated on experimental DNA trajectories by
our protocol.

In this regard, we report a few notes of caution on the
use of the FF algorithm. Its response depends appreciably on
the relative position and orientation of the rows (i.e. the
experimental curvature profiles) arranged into the starting
curvature matrix, and in fact we observed relevant
deviations in the θ⁰ magnitude according to the chosen initial
conditions. This ambiguity is due to the well-known
possibility for a hill-climbing optimization routine to provide
solutions representing local minima of the objective
function (here the mean value of columns variance)
instead of the global one (19). For this reason, we
reported in Figure 7a the profile with the smallest mean
value of columns variance from a set of 10 curvature
profiles, obtained by iteratively shifting the starting
point of the automated FF algorithm within the ordered
ensemble of molecular profiles. It is thus not surprising
that there is, generally speaking, a systematic
uncertainty in the accuracy of θ⁰_s profiles and CP patterns
predicted by means of the FF algorithm. Moreover, the
pitfall of stopping at local minima is likely to become highly
probable for very large sets of molecular trajectories. A
second crucial drawback of the FF algorithm is
that it gives no indication of the alignment of
the reconstructed curvature profile with respect to chain
polarity (i.e. the 5'→3' direction), which complicates any
comparison of experimental results with theoretical
models and definitely hampers the implementation of
assays contrasting the profiles of a large number of
samples. As mentioned earlier, this serious limitation is
overcome by the introduction of the statistical chain
descriptor P_{s,L}, which leads to an orientation-independent
description of local intrinsic curvature through the CP
pattern.
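
The orientation independence invoked here can be made explicit with a short derivation. The notation is an assumption made for the example: C_{s,L} denotes the signed curvature of the L-long segment starting at curvilinear position s, L_tot is the total contour length, and the descriptor is taken to be the product of the curvatures of the two mirror-symmetric segments.

```latex
% Assumed definition of the chain descriptor
P_{s,L} = C_{s,L}\, C_{L_{\mathrm{tot}}-s-L,\,L}

% Reading the same contour from the opposite end maps s to
% L_tot - s - L and reverses the sign of every bend angle:
C'_{s,L} = -\,C_{L_{\mathrm{tot}}-s-L,\,L}

% The two sign changes cancel in the product:
P'_{s,L} = C'_{s,L}\, C'_{L_{\mathrm{tot}}-s-L,\,L}
         = \bigl(-C_{L_{\mathrm{tot}}-s-L,\,L}\bigr)\bigl(-C_{s,L}\bigr)
         = P_{s,L}
```

Under these assumptions each molecule contributes the same value of P_{s,L} whichever end is taken as the origin, so the ensemble average needs no alignment step and no hill-climbing optimization over orientations.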

A case study: sensitivity of CP patterns to point mutations 
in the OPN encoding gene 

In view of the excellent response offered by our method in 
terms of robustness, accuracy and flexibility, we foresee 
several challenging applications for the CP patterns. To 
this purpose, we first note that the typical target specimens
should consist of 10²-10³ bp long chains: these are readily
deposited by standard protocols on atomically smooth
substrates and can be routinely imaged by AFM
(1-13,15-21). High-resolution AFM (8,24) is mandatory
in order to explore the intrinsic curvature of shorter chains
(<10² bp) and achieve reproducible estimates of the
angular parameters for fragments as small as 15 bp
(L ≈ 5 nm). Secondly, we emphasize that the applicability
of the new method goes definitely beyond the case of
the OPN encoding gene. In fact we demonstrated above
that any DNA strand with a non-zero s versus ⟨C_{s,L}⟩
profile is described as well by a CP profile with comparable
accuracy and sensitivity. This follows directly from
the definition of the CP descriptor [Equations (3), (4),
Supplementary Equations (S10), (S11)]. Moreover, we
carried out additional simulations on two model
systems, namely 500 bp random sequences and the 937
bp EcoRV-PstI fragment of pBR322 DNA (see
Supplementary Data for details). The obtained results attest that
DNA templates that do not contain prominent nucleotide
sequences responsible for large bends are nevertheless
characterized by informative CP patterns. Altogether,
such arguments make us confident of the applicability of
our method to a broad class of intrinsically bent duplexes.

One interesting possibility of application might regard 
the systematic use of CP maps to deeply explore the pre- 
dictions of DNA adsorption and bending models. 
An insight into this topic was provided in the sections
above and significant improvements are expected to
come from state of the art modeling (such as Brownian
dynamics and molecular dynamics simulations) going
beyond the nearest-neighbor approximation in
conformational analysis or describing the non-equilibrium
processes of DNA adsorption and relaxation on the atomically
flat substrate (24,35,37,38). For example, a tight
comparison of experimental and theoretical CP patterns
might allow the identification of restricted regions
where out-of-equilibrium alterations of the chain
architecture systematically take place during adsorption. This
information might be eventually related to the local
base-pair sequence and/or exploited to tune DNA adsorption
according to the needs of novel comparative assays.
Another challenge might involve the use of CP patterns 
to routinely detect small conformational changes in large 
sample numbers. The capability to relate DNA structural 
variations to physical or biological causes (e.g. mutations 
at one or more base-pairs) might eventually contribute to 
develop new assays and even genetic screening protocols 
for highly bent duplexes. Interestingly, some studies might 
explore the ultimate sensitivity of CP patterns to point 
mutations and mismatched base-pairs and largely contrib- 
ute to the discovery of physical methodologies for molecu- 
lar haplotyping (16,39). Within this context we offer a 
concrete example on the CP patterns sensitivity to single 
nucleotide polymorphisms (SNPs) in the OPN encoding 
gene. In detail, we contrast two homozygous specimens 
having different SNPs at four, well-known polymorphic 
sites. To date, there is a well documented functional effect 
of such SNPs on the OPN gene transcriptional activity 
(22), and they play a useful role as genetic markers to 
characterize patients with oligoarticular juvenile idio- 
pathic arthritis (40). In Figure 8a, we show the 3D 
model chains predicted for the two specimens. It appears 
that the insertion (or deletion) of an individual G base 
at the sequence site 762 (marked by the vertical arrows) 
dramatically affects the whole DNA bending close to the 
centre of the chain, in fact inducing a variation of 
the relative orientation of the 5' half with respect to the 
3' end. This is confirmed by the corresponding CP 
patterns, evaluated through theory and experiment as 
described above. In particular, Figure 8b attests the
emergence of statistically relevant differences for the
experimental CP values at four main regions of the curvilinear
distance s (highlighted in gray). The experimental
pattern of the 1335 bp specimen also shows less marked
amplitude variations with respect to the 1332 bp
counterpart. A detailed analysis of the overall fluctuation of the
CP signal for different L values (50-120 bp) confirms that
this feature systematically occurs in both theory and
experiment (see Supplementary Data); it thus represents a
robust sequence-dependent property of the samples that is
successfully captured by the CP method. We underline
that the most relevant message of Figure 8b is to
document the practical feasibility of the label-free
comparative analysis envisaged in Figure 2c. Such an assay
represents the crucial advantage offered by the CP
method with respect to other conformational methods
and is reported in the present investigation, to the best
of our knowledge, for the first time.

Figure 8. (a) Representation of the average 3D shape of two homozygous
samples according to the wedge model by De Santis et al. The
SNP at the sequence site 762 (marked by arrows) impacts the overall
relative orientation of the 5' half with respect to the 3' end, whereas the
other three SNPs do not substantially affect the DNA shape. (b) Top:
theoretical patterns of variation of the CP for the two chains in (a) with
L = 34 nm. Bottom: experimental patterns of variation for the CP.
Gray regions highlight statistically relevant differences between the
two specimens, in excellent agreement with theoretical predictions.

We strongly believe that the possibility to routinely 
achieve similar results, through the use of symmetric 
curvature descriptors operating under label-free condi- 
tions, should boost the applicability of AFM conform- 
ational analysis in novel, genetic screening tests. 



We finally note that further attractive developments 
might come from the evaluation of CP patterns to 
address the structural properties of DNA fragments com- 
plexed with intercalating dyes and binding drugs (12,41) 
or even proteins. In fact the CP patterns might be quite 
useful to complement current AFM studies on the forma- 
tion of protein-DNA complexes [e.g. ref. (42)], where the 
position distribution of protein binding along unlabeled 
DNA fragments is calculated relative to the closest 
DNA terminus. Indeed this choice statically couples
binding events occurring on symmetrically placed tracts,
in analogy with the curvature coupling contained in the
P_{s,L} definition. As a result, a visual correlation of s versus
⟨P_{s,L}⟩ and s versus protein-binding-frequency plots would
easily point out the existence of helix sites where local
intrinsic curvature drives the so-called 'indirect' DNA
recognition or competes with other binding mechanisms
(25,43). This is certainly of great interest for fundamental
investigations addressing the ability of proteins to
locate specific sites or structures among a vast excess of 
non-specific, intrinsically bent DNA, as in the relevant 
case of mismatch repair proteins interrogating DNA to 
find out biosynthetic errors and promote strand-specific 
repair (11). 



CONCLUSIONS 

In this article we proposed a novel method to characterize 
the local intrinsic curvature of adsorbed DNA molecules 
by AFM. It relies on the fine mapping of a statistical chain 
descriptor that highlights all pairs of intrinsically bent 
segments symmetrically placed along the helix chain. 
This peculiar choice provides a number of advantages 
overcoming some fundamental and practical limitations 
of early protocols. It is in fact well known that such
protocols generate intrinsic curvature maps starting from the
contours of end-labeled molecules or palindromes and
that the conformational averages are carried out under the
assumption that a preferential DNA adsorption takes
place. More importantly, none of the current methods is
expected to readily manage comparative assays involving
a large number of samples. On the contrary, we
demonstrated, both theoretically and experimentally, that the
novel method can be implemented on label-free molecules
with unknown orientation, in fact reducing specimen
preparation to standard procedures for DNA deposition.
Accordingly, neither end-labeled molecules nor
palindromic constructs are strictly required and no a priori
assumptions or additional evidence on the DNA
adsorption mechanisms are necessary. Experimental
uncertainty affecting the new curvature patterns is comparable
to that already discussed in early works and derives
from AFM tip convolution effects and the specific algo- 
rithm used for DNA tracing from AFM topographies. 

We therefore conclude that the novel method paves the 
way for a reliable, unbiased, label-free comparative 
analysis of bent duplexes, aimed to detect local conform- 
ational changes of physical or biological relevance in 
large sample numbers. To this purpose, we suggested
a few relevant examples that should boost the applicability
of AFM-based curvature studies, e.g. validating DNA
adsorption and bending models by experiments,
uncovering DNA interactions with proteins, intercalating dyes
and drugs, setting up population-based genetic disease
studies or solving genomic screening problems at the
single-molecule level.

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online:
Supplementary Table S1, Supplementary Figures S1-S9,
Supplementary Materials and Methods and
Supplementary Equations S1-S11.